A Genetic k-Modes Algorithm for Clustering Categorical Data

Authors

  • Guojun Gan
  • Zijiang Yang
  • Jianhong Wu
Abstract

Many optimization-based clustering algorithms can stop at locally optimal partitions of a data set. In this paper, we present a genetic k-Modes algorithm (GKMODE) that finds a globally optimal partition of a given categorical data set into a specified number of clusters. We introduce a k-Modes operator in place of the normal crossover operator. Our analysis shows that the clustering results produced by GKMODE are highly accurate and that it performs much better than existing algorithms for clustering categorical data.
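
The abstract does not give the chromosome encoding, the selection scheme, or the mutation details, so the following is only a minimal sketch of the general idea under stated assumptions: each chromosome is a vector of cluster labels, selection is a binary tournament, mutation relabels objects at random, and one k-modes step (recompute modes, then reassign each object to its nearest mode under simple matching dissimilarity) is applied where crossover would normally go. All function and parameter names are hypothetical, and the sketch makes no optimality claim.

```python
import random
from collections import Counter


def matching_distance(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def compute_modes(data, labels, k):
    """Attribute-wise mode of each cluster; None marks an empty cluster."""
    modes = []
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        if not members:
            modes.append(None)
            continue
        modes.append(tuple(Counter(col).most_common(1)[0][0]
                           for col in zip(*members)))
    return modes


def cost(data, labels, k):
    """Total dissimilarity between each object and the mode of its cluster."""
    modes = compute_modes(data, labels, k)
    return sum(matching_distance(row, modes[lab])
               for row, lab in zip(data, labels))


def k_modes_operator(data, labels, k):
    """One k-modes step, used here in place of crossover (assumed design)."""
    modes = compute_modes(data, labels, k)
    new_labels = []
    for row in data:
        # Ties are broken in favor of the lower cluster index.
        candidates = [(matching_distance(row, m), c)
                      for c, m in enumerate(modes) if m is not None]
        new_labels.append(min(candidates)[1])
    return new_labels


def genetic_k_modes(data, k, pop_size=20, generations=30,
                    mutation_rate=0.05, seed=0):
    """Assumed GA loop: tournament selection, mutation, k-modes operator."""
    rng = random.Random(seed)
    n = len(data)
    population = [[rng.randrange(k) for _ in range(n)]
                  for _ in range(pop_size)]
    best = min(population, key=lambda ch: cost(data, ch, k))
    for _ in range(generations):
        # Selection: binary tournament on cost (an assumption of this sketch).
        selected = []
        for _ in range(pop_size):
            a, b = rng.choice(population), rng.choice(population)
            winner = a if cost(data, a, k) <= cost(data, b, k) else b
            selected.append(list(winner))
        # Mutation: relabel each object with a small probability.
        for ch in selected:
            for i in range(n):
                if rng.random() < mutation_rate:
                    ch[i] = rng.randrange(k)
        # k-modes operator applied to every chromosome instead of crossover.
        population = [k_modes_operator(data, ch, k) for ch in selected]
        best = min(population + [best], key=lambda ch: cost(data, ch, k))
    return best, cost(data, best, k)


if __name__ == "__main__":
    toy = [("a", "x", "p")] * 3 + [("b", "y", "q")] * 3
    labels, total = genetic_k_modes(toy, k=2)
    print(labels, total)
```

Keeping the best chromosome seen so far means a disruptive mutation cannot lose a good partition. A single k-modes step never increases the cost of a chromosome, which is the usual motivation for using a local-search operator in this position.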

Similar resources

An Optimization K-Modes Clustering Algorithm with Elephant Herding Optimization Algorithm for Crime Clustering

In past decades, detecting and preventing crime required several years of research and analysis. Today, thanks to smart systems based on data mining techniques, it is possible to detect and prevent crime in considerably less time. Classification and clustering-based smart techniques can classify and cluster the crime-related samples. The most important factor in the c...

Genetic Distance Measure for K-modes Algorithm

The k-means algorithm has been shown to be an effective and efficient algorithm for clustering. However, it was developed for numerical data only and is not suitable for clustering non-numerical data. The k-modes algorithm was developed for clustering categorical objects by extending the k-means algorithm. However, no one applies this technique for classification of c...

A Variant of Genetic Algorithm Based Categorical Data Clustering for Compact Clusters and an Experimental Study on Soybean Data for Local and Global Optimal Solutions

Almost all partitioning clustering algorithms get stuck at locally optimal solutions. Using genetic algorithms (GA), globally optimal results can be found. This work proposes and investigates a new variant of the genetic algorithm (GA) based k-Modes clustering algorithm for categorical data. A statistical analysis has been done on the popular categorical dataset which shows the ...

Automatic Clustering of Mixed Data Using a Genetic Algorithm

In real-world clustering problems, one often needs to perform cluster analysis on data sets with mixed numeric and categorical values. However, most existing clustering algorithms are efficient only for numeric data rather than mixed data sets. In addition, traditional methods, for example, the K-means algorithm, usually ask the user to provide the number of clusters. In this...

A genetic fuzzy k-Modes algorithm for clustering categorical data

The fuzzy k-Modes algorithm introduced by Huang and Ng [Huang, Z., & Ng, M. (1999). A fuzzy k-modes algorithm for clustering categorical data. IEEE Transactions on Fuzzy Systems, 7(4), 446–452] is very effective for identifying cluster structures from categorical data sets. However, the algorithm may stop at locally optimal solutions. In order to search for appropriate fuzzy membership matrices...

Publication date: 2005